Transcriptomics: Lecture 1

Biotech 7005/Bioinf 3000
Frontiers of Biotechnology: Bioinformatics and Systems Modelling
The University of Adelaide

Author: Dr Stevie Pederson (They/Them)

Affiliation: Black Ochre Data Labs, The Kids Research Institute Australia

Published: September 1, 2025

Welcome To Country

I’d like to acknowledge the Kaurna people as the traditional owners and custodians of the land we know today as the Adelaide Plains, where I live & work.

I also acknowledge the deep feelings of attachment and relationship of the Kaurna people to their place.

I pay my respects to the cultural authority of Aboriginal and Torres Strait Islander peoples from other areas of Australia, pay my respects to Elders past, present and emerging, and acknowledge any Aboriginal Australians who may be with us today.

Introduction To Transcriptomics

Introduction

  • Postdoctoral Bioinformatician, Black Ochre Data Labs, Adelaide
  • Working in collaboration with members of the SA Aboriginal community
  • Multi-omics project to identify and address the underlying causes of high T2D rates and complications
    • Using genomics, epigenomics, transcriptomics and other layers
    • My focus is on the transcriptomics layer

Why Transcriptomics?

  • DNA can be described as being like a giant book of instructions
  • Some regions are defined as genes
    • Originally considered to be the basic unit of inheritance
    • Now commonly used to describe a region of DNA transcribed into RNA

Why Transcriptomics?

  • DNA \(\rightarrow\) mRNA \(\rightarrow\) Proteins
    • Commonly referred to as the Central Dogma of Molecular Biology
  • Proteins are the workhorses of the cell & body
    • Do most of the work, and are responsible for most of the structure
    • Examples like keratin (hair), haemoglobin (oxygen transport) etc
  • ncRNAs are also highly functional
    • Ribosomal RNA (rRNA) essential for translation from mRNA to Protein
    • microRNAs play a role in gene-regulation via mRNA stability
    • The lncRNA Xist coats the entire X chromosome during X inactivation

Why Transcriptomics?

Definition

Based on Wang, Gerstein, and Snyder (2009)

The transcriptome can be defined as the complete set of (RNA) transcripts in a cell, or a population of cells, for a specific developmental stage or physiological condition

  • Transcriptomics is simply the study of the transcriptome
  • Can be the entire RNA content of a cell (or cells) or a subset of molecules (e.g. mRNA, miRNA)

Why Transcriptomics?

Taken from Fang et al. (2015)
  • Most RNA is single-stranded but can have extremely complex structure
    • Shown is a 2kb region from the lncRNA Xist (17kb in total)
  • Also interacts with the antisense lncRNA Tsix

Why Transcriptomics?

  • Is a snapshot of the dynamic biological processes associated with a biological question
  • Used to make inferences about these processes
    • Identify therapeutic targets for Cardiovascular Disease
    • Biomarkers for CAR-T cells
    • Key drivers of correlated gene networks
    • Early drivers of neurodegeneration in Alzheimer’s disease
  • Assumed to be low-level
    • DNA \(\rightarrow\) RNA \(\rightarrow\) Protein \(\rightarrow\) Metabolites, Signalling molecules, etc …

Why Transcriptomics?

  • Is the first molecular level where quantity becomes a key aspect
    • Highly-expressed, or low-expressed genes are important
    • Changes in response to stimulus impact gene expression levels
  • Most early transcriptomic analyses were quantitative
    • Sequence variation often captured at DNA-level
  • Now extending to transcript structure and modifications
    • Identification of fusion transcripts, RNA-methylation etc

Why Transcriptomics?

  • Early techniques often used large numbers of cells
    • Often multiple cell types within a biological sample
  • Modern techniques are incredibly detailed
    • Single-cell RNA-seq characterises exact cell types and cell trajectories
    • Spatial transcriptomics used to identify co-located cells in tissue
    • Identify cell-cell signalling in situ

What Is Transcription

Definition

Transcription is the process of making an RNA copy of a gene sequence

Figure taken from [1], licensed under CC-BY 4.0 by OpenStax

Steps of Transcription

  1. RNA polymerase binds to the promoter along with \(\geq1\) transcription factors
  2. RNA polymerase creates a transcription bubble
    • Separates the two DNA strands by breaking the hydrogen bonds between complementary DNA nucleotides
  3. RNA polymerase adds RNA nucleotides
    • Complementary to the antisense (template) DNA strand
  4. RNA sugar-phosphate backbone forms
  5. Hydrogen bonds of the RNA–DNA complex break, freeing the newly synthesized RNA strand
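
The base-pairing logic of these steps can be sketched in a few lines of Python. This is only an illustration of complementarity, not a model of the polymerase: the sequence `coding` is a made-up toy example, and the function names are hypothetical.

```python
# RNA partner for each DNA base on the template (antisense) strand
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
# DNA partner for each DNA base (used to derive the template strand)
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def template_from_coding(coding_5_to_3: str) -> str:
    """Return the template strand, written 3'->5', for a coding strand written 5'->3'."""
    return "".join(DNA_TO_DNA[base] for base in coding_5_to_3.upper())


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA, written 5'->3', synthesised along a template read 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


coding = "ATGGCCTAA"  # toy sense-strand sequence (assumption, not a real gene)
template = template_from_coding(coding)
rna = transcribe(template)

print(template)  # TACCGGATT
print(rna)       # AUGGCCUAA
```

Because the RNA is complementary to the antisense strand, it matches the sense (coding) strand with U in place of T.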

Steps of Transcription

If the cell is a eukaryotic cell

  1. RNA processing
    • This may include polyadenylation, capping and splicing
    • Occurs during (or immediately after) transcription
  2. RNA Localisation
    • The RNA may remain in the nucleus or exit to the cytoplasm through the nuclear pore complex
  • Eukaryotic mRNA, miRNA & snRNA transcription uses RNA Polymerase II
    • RNA Pol I: rRNA
    • RNA Pol III: tRNA, 5S rRNA and some other small RNAs

Eukaryotic mRNA Processing

  • Nuclear mRNAs have a 5’ cap added
    • Protects single-stranded mRNA from degradation
    • Regulates nuclear export
    • Promotes translation
  • mRNAs are polyadenylated at the 3’ end
    • Also protects from degradation
    • Aids in transcription termination, export and translation
  • Introns are spliced out as required
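
As a rough sketch of splicing and polyadenylation at the sequence level, the snippet below joins exon intervals and appends a poly(A) tail. The sequences, exon coordinates and tail length are toy values chosen for illustration, and the 5’ cap is not represented because it is a chemical modification rather than a templated sequence.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join exon intervals (0-based, end-exclusive) in order, dropping the introns."""
    return "".join(pre_mrna[start:end] for start, end in exons)


# Toy pre-mRNA: exon 1 + intron + exon 2 (all sequences are assumptions)
exon_1 = "AUGGCC"
intron = "GUAAGUUUUCAG"
exon_2 = "GAAUAA"
pre_mrna = exon_1 + intron + exon_2

exons = [(0, 6), (18, 24)]       # coordinates of the two exons in pre_mrna
spliced = splice(pre_mrna, exons)
mature = spliced + "A" * 20      # hypothetical poly(A) tail length

print(spliced)  # AUGGCCGAAUAA
```

Passing a different list of exon intervals to `splice` gives a different transcript from the same pre-mRNA.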

Eukaryotic mRNA Processing

Taken from Shafee and Lowe (2017)

Alternate Transcripts and Isoforms

Image by the National Human Genome Research Institute

Transcriptome Resources

  • Reference Transcriptomes & Genomes are now commonly available
    • Incorporate experimentally derived & predicted sequences + loci
  • Gencode [2] provides the highest-quality annotations for mouse & human
    • Release 48 (GRCh38): 78,686 genes + 385,669 transcripts
  • Other organisms from Ensembl, RefSeq, UCSC etc
    • Zebrafish, Rat, Chicken, Drosophila, Wheat, Yeast, E. coli etc
  • Sometimes we build novel transcriptomes from specific tissues
    • e.g. sea snake venom gland, shiraz fruit
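
Reference annotations are commonly distributed as GTF files, where each tab-separated line describes one feature and the third column gives the feature type. A minimal sketch of how gene and transcript totals could be tallied is shown below. The annotation lines are an invented toy example (not real Gencode content), and `count_features` is a hypothetical helper.

```python
import io
from collections import Counter

# Toy annotation: 2 genes and 3 transcripts on a made-up chromosome
TOY_GTF = "\n".join([
    "##description: toy annotation for illustration only",
    'chrT\ttoy\tgene\t1\t1000\t.\t+\t.\tgene_id "GENE_A";',
    'chrT\ttoy\ttranscript\t1\t1000\t.\t+\t.\tgene_id "GENE_A"; transcript_id "TX_A1";',
    'chrT\ttoy\ttranscript\t1\t800\t.\t+\t.\tgene_id "GENE_A"; transcript_id "TX_A2";',
    'chrT\ttoy\tgene\t2000\t3000\t.\t-\t.\tgene_id "GENE_B";',
    'chrT\ttoy\ttranscript\t2000\t3000\t.\t-\t.\tgene_id "GENE_B"; transcript_id "TX_B1";',
])


def count_features(handle) -> Counter:
    """Count GTF lines by feature type (column 3), skipping comments and blank lines."""
    counts = Counter()
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        counts[fields[2]] += 1
    return counts


counts = count_features(io.StringIO(TOY_GTF))
print(counts["gene"], counts["transcript"])  # 2 3
```

For a real annotation file, the same function could be given an open file handle instead of the in-memory string.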

Early Transcriptomics

Northern Blotting

  • Northern blot (Alwine, Kemp, and Stark 1977) extended DNA-based methods (i.e. Southern blot)
    • Earliest single-gene method
  • Gel Electrophoresis then hybridisation with labelled probe
    • Requires some knowledge of RNA sequence
  • Images scanned \(\rightarrow\) Densitometric Analysis for crude quantitation
  • Possible for different isoforms to be detected
    • Sequence dependent

RT-qPCR

The CT values are actually estimated to a decimal value

  • “Gold-standard” for measurement of transcription levels
    • Single gene \(\implies\) not a high-throughput technique
  • Targets a single transcript region with specific primers to produce cDNA
    \(\rightarrow\) Polymerase Chain Reaction (PCR)
  • Each PCR cycle approximately doubles the target region
  • cDNA produced is identified using fluorophores
    • Fluorescence doubles with each cycle
  • Once fluorescence passes a detection threshold, the cycle number is recorded
    • Known as the Cycle Threshold (CT) value
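
The relationship between starting abundance and CT can be sketched under an idealised model in which the target grows as \(N_0 (1 + E)^c\) after \(c\) cycles, with \(E = 1\) for perfect doubling. The threshold value, starting quantities and the function name below are all assumptions made for illustration.

```python
import math


def cycle_threshold(n0: float, threshold: float, efficiency: float = 1.0) -> float:
    """Fractional cycle at which n0 * (1 + efficiency) ** cycle reaches the threshold."""
    return math.log(threshold / n0) / math.log(1.0 + efficiency)


threshold = 1e10  # hypothetical detection threshold, in molecules

# Hypothetical 10-fold dilution series of starting template
for n0 in (1e5, 1e4, 1e3):
    print(f"N0 = {n0:.0e}: CT = {cycle_threshold(n0, threshold):.2f}")

# Under perfect doubling, each 10-fold dilution delays CT by log2(10) cycles
shift = cycle_threshold(1e3, threshold) - cycle_threshold(1e4, threshold)
print(f"CT shift per 10-fold dilution: {shift:.2f}")
```

In this model a 10-fold dilution shifts CT by \(\log_2 10 \approx 3.32\) cycles, which is why CT values are not restricted to whole numbers.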

RT-qPCR

A 10-fold dilution series

RT-qPCR

  • Higher CT values \(\implies\) lower numbers of target molecule at the beginning
  • These can be used to estimate and compare abundance levels (i.e. gene expression)
  • Is vulnerable to technical artefacts (e.g. pipetting variability)
  • Often includes one or more “housekeeper” genes thought to be stably expressed
  • CT values are then normalised to the housekeeper genes \(\implies \Delta C_T\)
    • log2 transformed values are used
  • Comparison between conditions is the change in \(\Delta C_T \implies \Delta\Delta C_T\)
  • Represents change on the log2 scale, i.e. log fold-change
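
A worked example of the \(\Delta\Delta C_T\) calculation is sketched below. All CT values are invented for illustration, and the calculation assumes perfect doubling in every cycle. Because a lower CT means more starting template, the log2 fold-change has the opposite sign to \(\Delta\Delta C_T\).

```python
def delta_ct(ct_target: float, ct_housekeeper: float) -> float:
    """Normalise a target gene's CT to the housekeeper gene's CT."""
    return ct_target - ct_housekeeper


# Hypothetical CT values for one target gene and one housekeeper gene
control = {"target": 25.0, "housekeeper": 18.0}
treated = {"target": 23.0, "housekeeper": 18.0}

d_ct_control = delta_ct(control["target"], control["housekeeper"])  # 7.0
d_ct_treated = delta_ct(treated["target"], treated["housekeeper"])  # 5.0

dd_ct = d_ct_treated - d_ct_control  # -2.0
log2_fold_change = -dd_ct            # 2.0: lower CT means higher expression
fold_change = 2 ** log2_fold_change  # 4.0

print(dd_ct, log2_fold_change, fold_change)  # -2.0 2.0 4.0
```

Here the target crosses the threshold two cycles earlier in the treated sample, which corresponds to a 4-fold increase in expression relative to the control.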

Expressed Sequence Tags

  • The senior author on the EST paper was J. Craig Venter, who played an important role in the Human Genome Project
  • The first attempt at capturing the larger transcriptome was ESTs (Adams et al. 1991)
  • Identified 609 human brain mRNA sequences
    • Selected for polyA-mRNA then reverse transcribed
    • Used random primers \(\rightarrow\) Sanger Sequencing
  • 10 years before the Human Genome Project published its draft sequence
    • Gene discovery was a hot topic

Sanger Sequencing

Estevezj, CC BY-SA 3.0, via Wikimedia Commons

SAGE & CAGE

  • First high-throughput quantification method was Serial Analysis of Gene Expression (SAGE) (Velculescu et al. 1995)
  • mRNA \(\rightarrow\) cDNA using biotinylated primers
  • cDNA bound to beads (using biotin) & cleaved
  • 11mer “tags” were ligated into long sequences using linker sequences
  • Sequenced using Sanger Sequencing
  • Deconvolution & counting

Thomas Shafee, CC BY 4.0, via Wikimedia Commons
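
The final "deconvolution & counting" step can be sketched as cutting a sequenced concatemer back into fixed-length tags and tallying them. This is a deliberately simplified illustration: it ignores linker sequences and restriction sites, the tag sequences are made up, and `count_tags` is a hypothetical helper rather than part of any SAGE software.

```python
from collections import Counter


def count_tags(concatemer: str, tag_length: int = 11) -> Counter:
    """Cut a concatemer into consecutive fixed-length tags and count each distinct tag."""
    usable = len(concatemer) - len(concatemer) % tag_length  # drop any partial tag
    tags = (concatemer[i:i + tag_length] for i in range(0, usable, tag_length))
    return Counter(tags)


tag_a = "ACGTACGTACG"  # toy 11mer tag (assumption)
tag_b = "TTGCATGCCAA"  # toy 11mer tag (assumption)
concatemer = tag_a + tag_b + tag_a

tag_counts = count_tags(concatemer)
print(tag_counts[tag_a], tag_counts[tag_b])  # 2 1
```

Each tag count acts as a digital measure of how often the corresponding transcript was sampled.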

SAGE & CAGE

  • The terminology of counting tags is still used by some manuals & software to represent counting of sequences
  • Was described as Digital Gene Expression (DGE)
    • DGE still used but can be confused with Differential Gene Expression
  • A variant called Cap Analysis of Gene Expression (CAGE) targeted the 5’ Cap
  • Heavily used by FANTOM project (Abugessaisa et al. 2020) to identify exact Transcription Start Sites (TSS)

Microarray Technology

References

Abugessaisa, Imad, Jordan A Ramilowski, Marina Lizio, Jesicca Severin, Akira Hasegawa, Jayson Harshbarger, Atsushi Kondo, et al. 2020. “FANTOM Enters 20th Year: Expansion of Transcriptomic Atlases and Functional Annotation of Non-Coding RNAs.” Nucleic Acids Research 49 (D1): D892–98. https://doi.org/10.1093/nar/gkaa1054.
Adams, Mark D., Jenny M. Kelley, Jeannine D. Gocayne, Mark Dubnick, Mihael H. Polymeropoulos, Hong Xiao, Carl R. Merril, et al. 1991. “Complementary DNA Sequencing: Expressed Sequence Tags and Human Genome Project.” Science 252 (5013): 1651–56. http://www.jstor.org/stable/2876333.
Alwine, J. C., D. J. Kemp, and G. R. Stark. 1977. “Method for detection of specific RNAs in agarose gels by transfer to diazobenzyloxymethyl-paper and hybridization with DNA probes.” Proc. Natl. Acad. Sci. U.S.A. 74 (12): 5350–54.
Fang, Rui, Walter N Moss, Michael Rutenberg-Schoenberg, and Matthew D Simon. 2015. “Probing Xist RNA Structure in Cells Using Targeted Structure-Seq.” PLoS Genet. 11 (12): e1005668.
Shafee, Thomas, and Rohan Lowe. 2017. “Eukaryotic and Prokaryotic Gene Structure.” WikiJournal of Medicine, January. https://doi.org/10.15347/WJM/2017.002.
Velculescu, V. E., L. Zhang, B. Vogelstein, and K. W. Kinzler. 1995. “Serial analysis of gene expression.” Science 270 (5235): 484–87.
Wang, Zhong, Mark Gerstein, and Michael Snyder. 2009. “RNA-Seq: A Revolutionary Tool for Transcriptomics.” Nat. Rev. Genet. 10 (1): 57–63.

Footnotes

  1. https://openoregon.pressbooks.pub/mhccbiology102/chapter/transcription/↩︎

  2. https://www.gencodegenes.org/↩︎